Automatic Discovery of Preservation Alternatives Supported by Community Maintained Knowledge Bases
Authors
Abstract
Preservation Planning, which deals with selecting the most appropriate preservation action to apply to digital objects, is an important step in any digital preservation activity. Comprehensive Preservation Planning depends on the availability of identified alternative preservation actions, for example file format migrations that move data from an outdated format to one with better support. Emulation, e.g. of the behaviour of a specific software application (application emulation), can also be a viable preservation action. The alternative identification step can be performed either manually by an expert or (semi-)automatically, if appropriate knowledge bases are available. Building and maintaining such knowledge bases is, however, a tedious task, as the number of software applications and file formats, and especially of the relations between them, is very large. In this paper, we therefore present an approach to automatically build knowledge bases for Preservation Planning from existing open resources. One such source is the community-maintained Freebase, which contains structured linked data on many topics, among them file formats, software applications and, most importantly, their relations. We demonstrate the applicability of these knowledge bases by automatically identifying possible digital preservation actions for a use case, an eScience experiment from the domain of data mining. This use case originates from the task of process preservation, where we look beyond single files and regard complete chains of executions as the objects to be preserved.
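As a rough illustration of how a knowledge base of format and software relations could yield migration alternatives, the following minimal sketch looks up every tool that reads a given source format and lists the formats that tool can write. It is not the method of the paper: the tool names, format names, the `READS`/`WRITES` tables and the function `migration_alternatives` are all hypothetical, and the data is held in memory instead of being drawn from Freebase.

```python
from collections import defaultdict

# Hypothetical relations, invented for illustration only.
# READS[tool]  = formats the tool can open
# WRITES[tool] = formats the tool can save
READS = {
    "ConverterA": {"OLD"},
    "SuiteB": {"OLD", "OPEN"},
    "ViewerC": {"OPEN"},
}
WRITES = {
    "ConverterA": {"OPEN"},
    "SuiteB": {"OLD", "OPEN", "ARCHIVE"},
    "ViewerC": set(),
}


def migration_alternatives(source_format, reads, writes):
    """Map each reachable target format to the tools that can perform
    a one-step migration from source_format to that target."""
    alternatives = defaultdict(set)
    for tool, readable in reads.items():
        if source_format not in readable:
            continue  # this tool cannot open the source format
        for target in writes.get(tool, set()):
            if target != source_format:  # same-format output is no migration
                alternatives[target].add(tool)
    # Sort tool names so the result is deterministic.
    return {target: sorted(tools) for target, tools in alternatives.items()}


if __name__ == "__main__":
    for target, tools in sorted(migration_alternatives("OLD", READS, WRITES).items()):
        print(f"OLD -> {target}: {', '.join(tools)}")
```

Each entry in the result is one candidate preservation action (a migration to a target format together with the tools able to carry it out), which a planner would still have to evaluate against its own criteria.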
Similar resources
The Computer Revolution in Science: Steps Towards the Realization of Computer-Supported Discovery Environments
The tools that scientists use in their search processes together form so-called discovery environments. The promise of artificial intelligence and other branches of computer science is to radically transform conventional discovery environments by equipping scientists with a range of powerful computer tools including large-scale, shared knowledge bases and discovery programs. We will describe th...
Overview of Configurators as Effective Tools for Corporate Knowledge Management
Knowledge management is widely discussed, but scarcely supported with tools and information systems. Configurators capture knowledge from product design, marketing and sales. By that configurators support three of the core processes of knowledge management: distribution, usage and preservation of knowledge. Constraint-based configurators and other model-based reasoning systems provide the techn...
Tools for knowledge acquisition within the NeuroScholar system and their application to anatomical tract-tracing data
BACKGROUND Knowledge bases that summarize the published literature provide useful online references for specific areas of systems-level biology that are not otherwise supported by large-scale databases. In the field of neuroanatomy, groups of small focused teams have constructed medium size knowledge bases to summarize the literature describing tract-tracing experiments in several species. Desp...
Automatic Preservation Watch using Information Extraction on the Web
The ability to recognize when digital content is becoming endangered is essential for maintaining the long-term, continuous and authentic access to digital assets. To achieve this ability, knowledge about aspects of the world that might hinder the preservation of content is needed. However, the processes of gathering, managing and reasoning on knowledge can become manually infeasible when the v...
Automatic Discovery of Technology Networks for Industrial-Scale R&D IT Projects via Data Mining
Industrial-scale R&D IT projects depend on many sub-technologies, which must be understood and have their risks analysed before the project begins if it is to succeed. When planning such a project, the list of technologies and their associations with each other is often complex and forms a network. Discovery of this network of technologies is time consumi...